Fast algorithm for population-based protein structural model analysis.
Abstract
De novo protein structure prediction often generates a large population of candidate structures (models) and then selects near-native models through clustering. Existing structural model clustering methods are time-consuming because they compute pairwise distances between models. In this paper, we present a novel method for fast model clustering without losing clustering accuracy. Instead of the commonly used pairwise root mean square deviation (RMSD) and TM-score values, we propose two new distance measures, Dscore1 and Dscore2, based on comparing protein distance matrices, which describe the difference and the similarity among models, respectively. Our analysis indicates that Dscore1 correlates strongly with RMSD and Dscore2 correlates strongly with TM-score. Whereas existing methods have calculation time quadratic in the number of models, our Dscore1-based clustering achieves linear time complexity while obtaining almost the same accuracy for near-native model selection. By using Dscore2 to select cluster representatives, we can further improve the quality of the representatives with little increase in computing time. In addition, for large model sets (~500 k models), we can provide a fast data visualization based on the Dscore distribution in seconds to minutes. Our method has been implemented in a package named MUFOLD-CL, available at http://mufold.org/clustering.php.
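The abstract does not give the formulas for Dscore1 or Dscore2, so the following is only a minimal sketch of the general idea of comparing two models through their distance matrices rather than through superposition. It uses a generic root mean square difference of inter-residue distances as a stand-in; the function names `distance_matrix` and `matrix_distance` and the toy coordinates are assumptions made for illustration and are not taken from MUFOLD-CL.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def distance_matrix(coords: Sequence[Coord]) -> List[List[float]]:
    """Pairwise Euclidean distances between residues (e.g. C-alpha atoms) of one model."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def matrix_distance(m1: List[List[float]], m2: List[List[float]]) -> float:
    """Root mean square difference over the upper triangles of two distance matrices.

    This is a generic stand-in measure, NOT the paper's Dscore1 or Dscore2.
    """
    n = len(m1)
    if n != len(m2) or n < 2:
        raise ValueError("models must have the same length (at least 2 residues)")
    total = 0.0
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            diff = m1[i][j] - m2[i][j]
            total += diff * diff
            count += 1
    return math.sqrt(total / count)


# Toy three-residue models (hypothetical coordinates, in angstroms).
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 0.0, 0.0), (0.0, 3.8, 0.0), (0.0, 7.6, 0.0)]  # model_a rotated
model_c = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]  # bent chain

d_ab = matrix_distance(distance_matrix(model_a), distance_matrix(model_b))
d_ac = matrix_distance(distance_matrix(model_a), distance_matrix(model_c))
print(d_ab, d_ac)
```

Because a distance matrix is unchanged by rotation and translation, the rotated copy `model_b` scores essentially zero against `model_a` without any structural superposition, while the bent `model_c` scores higher. This sketch still compares one pair of models at a time; it does not reproduce the linear-time clustering described in the abstract.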
Journal title:
- Proteomics
Volume 13, Issue 2
Pages -
Publication date 2013